Power-efficient Hierarchical Data Aggregation using Compressive Sensing in WSN



Xi Xu
Department of Electrical and Computer Engineering
University of Illinois at Chicago
Chicago, Illinois, 60607
Email: xxu25@uic.edu

Rashid Ansari
Department of Electrical and Computer Engineering
University of Illinois at Chicago
Chicago, Illinois, 60607
Email: ransari@uic.edu

Ashfaq Khokhar
Department of Electrical and Computer Engineering
University of Illinois at Chicago
Chicago, Illinois, 60607
Email: ashfaq@uic.edu



Abstract — Compressive sensing (CS) is a burgeoning technique being applied to diverse areas, including wireless sensor networks (WSNs). In WSNs, it has been studied in the context of data gathering and aggregation, particularly aimed at reducing data transmission cost and improving power efficiency. Existing CS-based data gathering work [2] [4] [5] in WSNs assumes a fixed and uniform compression threshold across the network, regardless of the data field characteristics. In this paper, we present a novel data aggregation architecture model that combines a multi-resolution structure with compressed sensing. The compression thresholds vary over the aggregation hierarchy, reflecting the underlying data field. Compared with previous relevant work, the proposed model shows significant energy savings in theoretical analysis. We have also implemented the proposed CS-based data aggregation framework on the SIDnet-SWANS platform, a discrete-event simulator commonly used for WSN simulations. Our experiments show substantial energy savings, ranging from 37% to 77% for different nodes in the network, depending on their position in the hierarchy.

Index Terms — Data Aggregation, Compressive Sensing, Hierarchy, Power Efficient Algorithm, Wireless Sensor Network

I. Introduction 

Data aggregation [1] is a key task performed within sensor networks to fuse information from multiple sensors and deliver it to a sink node in a manner that eliminates redundancy and enables energy saving. The redundancy is a consequence of the correlation inherent in smooth data fields, such as temperature, pressure, and sound measurements, in practical applications that include surveillance and habitat monitoring. Suppose a sensor transfers its single measurement to the sink over $N - 1$ intermediate sensors along a routing path. Each intermediate sensor combines the data it receives with its own data and forwards it along the route. This data aggregation process usually involves $O(N^2)$ data transmissions. However, when there is redundancy in the data, it lends itself to sparse representations and in-network compression, thereby yielding energy savings in the information transfer.

Recently the use of compressive data gathering has been examined [2] and shown to reduce transmission requirements to $O(NM)$, where $M$ represents the number of random measurements and $M \ll N$. If we ignore temporal change and only consider data in a certain time snapshot, then each sensor has only one measurement. According to compressive sensing (CS) theory [5], the Compressive Data Gathering (CDG) method [2] requires all the sensors to collectively provide at the sink at least $M = K \log N$ measurements to fully recover the signal, where $K$ is the sparsity of the signal. We note that when the CDG method is applied in a large-scale network, $M$ may still be a large number. Moreover, in the initial data aggregation phase in [2], leaf nodes unnecessarily transmit $M$ measurements, which is in excess of their sensed data and therefore introduces redundancy in data aggregation. Recognizing this, the hybrid CS aggregation method [4] [5] proposed an amalgam of the merits of non-CS aggregation and plain CS aggregation. It optimized the data aggregation cost by setting a threshold $M$ and applying CS aggregation when the data gathered in a sensor equals or exceeds $M$. The data transmission cost, and hence the energy consumption, is reduced. However, we observe that only a small fraction of sensors utilize the CS aggregation method, and the transmission measurement number $M = K \log N$ for those nodes using the CS aggregation method is still large.

Our work here shows that significant improvement is possible, and it stems from a hierarchical clustering architecture that we propose. The central idea is to configure sensor nodes such that, instead of one sink node being targeted by all sensors, several nodes are designated for intermediate data collection and concatenated to yield a hierarchy of clusters at different levels. The use of the hierarchical architecture reduces the measurement number in the algorithm of [4] [5] for CS aggregation, since in the new architecture it is based on the cluster size rather than the global sensor network size $N$. In this paper, we propose a novel CS-based hierarchical data aggregation architecture over the sensor network and investigate its performance in terms of data rate and energy savings. To the best of our knowledge, we are the first to investigate the compressive sensing method for hierarchical data aggregation in sensor networks. We refer to our method as Hierarchical Data Aggregation using Compressive Sensing (HDACS).

The proposed data aggregation architecture distributes the workload of one sink to all the sensors, which is crucial for balancing energy consumption over the whole network. In this paper we also perform a theoretical analysis of the data transmission requirements and energy consumption in HDACS. We implement our proposed architecture on the SIDnet-SWANS simulation platform and test different sizes of two-dimensional randomly deployed sensor networks. The results validate our theoretical analysis. Substantial energy savings are reported for a large portion of sensors at different hierarchical positions, ranging from 50% to 77% when compared with [2], and from 37% to 70% when compared with [4] [5].

Fig. 1. CS data aggregation Architecture

II. Proposed Data Aggregation Architecture 
A. Model and analysis 

The main idea behind this new architecture is that the sensors no longer all send their data toward a single sink. Instead, collecting clusters are concatenated to form different types of clusters at different levels. The data flows from the source node through the architecture to the sink.

Suppose $N$ sensors have been uniformly and randomly deployed in a 2D square space with area $S$. Let $s$ be the unit area in the lowest level, and let the clusters be defined in a multi-resolution way with highest level $T$. In each level $i$, we define:

• $s_i^{(l)}$: the area of the $l$-th cluster

• $c_i^{(l)}$: the $l$-th cluster head in the network

• $N_i^{(l)}$: the number of sensors in one cluster

• $M_i^{(l)}$: the transmission number of measurements for the $l$-th cluster

• $d_i^{(l)}$: the sum of distances between cluster head $c_i^{(l)}$ and its children nodes

• $\gamma_i^{(l)}$: the ratio of transmitting data size and receiving data size

• $E_i^{(l)}$: the transmission energy cost in the $l$-th cluster

• $C_i$: the collection of cluster heads

• $|C_i|$: the number of cluster heads

• $M_i$: the sum of measurements for transmission within the network

• $E_i$: the transmission energy cost for all the clusters

Here, $C_i$ is defined as the collection of all the clusters in level $i$.



1) 2D uniform and random network deployment and analysis: In order to simplify the model analysis and get quantitative comparisons with previous work, we put some constraints on our CS data aggregation architecture. Figure 1 shows the logical tree of the clustering configuration, which consists of identical clusters of $n$ nodes in level $i$ for $i \ge 2$ and a random number of leaf nodes $N_1^{(l)} \ge n$ in level 1. Consider an $N$-node network, where $N = N' + n^T$.

Besides, in level $i$, $d_i^{(l)}$ is the sum of distances between cluster head $c_i^{(l)}$ and its children cluster heads $c_{i-1}^{((l-1)n+1)}, c_{i-1}^{((l-1)n+2)}, \ldots, c_{i-1}^{((l-1)n+n)}$ in level $i-1$.

The number of cluster heads is $|C_i|$. The area of cluster $s_i^{(l)}$ will be the same as that of all the other clusters at that level. We denote this common area as $s_i$, which combines $n$ subregions from level $i-1$ and satisfies the relation $s_i = n \, s_{i-1}$.

We distribute the sensors in a 2D randomly deployed network with some constraints.

• There will be at least $n$ nodes in each cluster in level 1. This property requires maximizing the probability of $n$ nodes in one cluster, which in turn requires minimizing $N'/n^T$. From experience, we set $n = 4^a$ for a positive integer $a$ to guarantee full coverage of the whole region by square clusters without producing intersections between two neighboring clusters. Therefore, $T = \lfloor \log_n N \rfloor$, and we get the minimum of $N'/n^T$.

• The remaining sensors are uniformly and randomly distributed in the $|C_1|$ clusters. So we seek to maximize the probability of $N'$ nodes in $|C_1|$ clusters. This has already been achieved by the constraints in (1).

The main advantage of this network deployment is that it is based on a 2D randomly deployed network topology, which corresponds to practical sensor distributions. It also addresses the case in which the condition $N = n^T$ is not met. Besides, the number of cluster heads will be at most $n^{T-1}$, and the number of leaf nodes will be $N - n^{T-1} \ge n^T - n^{T-1} = (n-1)n^{T-1}$. If $n > 2$, $N - n^{T-1} > n^{T-1}$. This result implies that only a small number of nodes will be involved in multiple-level data processing and aggregation. The only job of the other sensors is to send their data directly to the cluster head. The balance in load distribution is achieved by randomly choosing different cluster heads in each duty cycle.
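As a rough illustration of the counting argument above, the following Python sketch derives the hierarchy depth and node counts from $N$ and $n$. It assumes $T$ is the largest integer with $n^T \le N$; the function name `hierarchy_parameters` and its return layout are hypothetical and not part of the original work.

```python
def hierarchy_parameters(num_sensors, cluster_size):
    """Return (T, level-1 cluster heads, leaf nodes, N') for N sensors.

    Assumption: T is the largest integer with n**T <= N, and the
    number of level-1 cluster heads is taken at its maximum n**(T-1).
    """
    if cluster_size < 2 or num_sensors < cluster_size:
        raise ValueError("need cluster_size >= 2 and num_sensors >= cluster_size")
    levels, power = 0, 1
    # Integer loop instead of floating-point log to avoid rounding errors.
    while power * cluster_size <= num_sensors:
        power *= cluster_size
        levels += 1
    heads_level_1 = cluster_size ** (levels - 1)   # n^(T-1)
    leaves = num_sensors - heads_level_1           # N - n^(T-1)
    remainder = num_sensors - power                # N' = N - n^T
    return levels, heads_level_1, leaves, remainder


if __name__ == "__main__":
    print(hierarchy_parameters(1024, 4))
    print(hierarchy_parameters(400, 4))
```

For any $n > 2$ the leaf count returned here exceeds the number of level-1 cluster heads, which is the observation the paragraph relies on.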

2) The process of CS data aggregation: In the initial phase, leaf nodes send their raw data to their cluster head $c_1^{(l)}$, which adopts the same strategy as [4] [5] so as to reduce CS data aggregation redundancy, and compresses them into $M_1^{(l)} = K \log N_1^{(l)}$ random measurements.

In level $i$ ($i \ge 2$), the $l$-th cluster head $c_i^{(l)}$ receives $M_{i-1}^{(j)}$ random measurements, where $j \in [(l-1)n+1, (l-1)n+n]$, from its children cluster heads $c_{i-1}^{(j)}$. It performs a CS recovery algorithm to reconstruct the redundant data. After accumulating all the data from its children nodes, the cluster head takes $M_i^{(l)}$ random measurements of the signal and sends them to its parent cluster head in level $i+1$. According to compressive sensing theory, for a signal with sparsity $K$, $O(K \log N)$ random measurements will be enough to fully represent the original signal with cardinality $N$. The method in [4] [5] sets up a single threshold $M = K \log N$ to optimize the data transmission size. We propose to set up multiple thresholds $M_i^{(l)} = K \log N_i^{(l)}$ with upper bound $O(K \log N)$ in the top level $T$. $M_i$ is a small number when $i$ is small. This property helps to reduce the data transmission number and hence significantly save energy.
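To make the contrast between the per-level thresholds and the single global threshold concrete, here is a minimal Python sketch. It assumes base-2 logarithms and cluster sizes $n^i \le N_i < n^{i+1}$; the function names are illustrative only and do not come from the original work.

```python
import math


def level_threshold_bounds(cluster_size, levels, sparsity=1):
    """Bounds on M_i = K * log2(N_i) for n**i <= N_i < n**(i+1), i = 1..T.

    Assumption: logarithms are taken to base 2.
    """
    unit = sparsity * math.log2(cluster_size)  # K * log n
    return [(i * unit, (i + 1) * unit) for i in range(1, levels + 1)]


def global_threshold(num_sensors, sparsity=1):
    """Single network-wide threshold M = K * log2(N) used by hybrid CS."""
    return sparsity * math.log2(num_sensors)


if __name__ == "__main__":
    for level, (low, high) in enumerate(level_threshold_bounds(4, 3), start=1):
        print(f"level {level}: {low:.1f} <= M_i < {high:.1f}")
    print(f"global threshold for N = 400: {global_threshold(400):.2f}")
```

With $n = 4$ and $K = 1$, the level-1 threshold stays between 2 and 4 measurements, while the global threshold for 400 sensors is above 8, which is the gap the hierarchy exploits at the lower levels.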

3) Parameters Analysis: For $n^{T+1} > N = n^T + N' \ge n^T$, we get $n^{i+1} > N_i^{(l)} \ge n^i$ and measurements $M_i^{(l)}$ in the range $[iK\log n, (i+1)K\log n]$. The total transmission measurements $M$ for the whole data aggregation task is:

$$M = \sum_{i=1}^{T} M_i = \sum_{i=1}^{T} \sum_{l=1}^{|C_i|} M_i^{(l)}$$

Let $S_1 = \sum_{i=1}^{T-1} \frac{i}{n^i}$ and $S_2 = \sum_{i=1}^{T-1} \frac{1}{n^i}$; $S_1$ admits a closed form. Therefore, the lower bound of the data transmission number $M$ is:

$$\Omega(M) = N - n^{T-1} + K(n-1)n^{T-1} \log n \, S_1$$

and the upper bound is

$$O(M) = N - n^{T-1} + K(n-1)n^{T-1} \log n \, (S_1 + S_2)$$

On the other hand, if data is sent over the same data architecture, the total measurements with the plain or non-hybrid CS (NCS) algorithm in [2] is $M_{NCS} = N K \log N$. In [4] [5], the total measurements for the hybrid CS (HCS) algorithm is

$$M_{HCS} = \sum_{l=1}^{|C_1|} (N_1^{(l)} - 1) + \sum_{i=2}^{T} \sum_{l=1}^{|C_i|} (n-1) K \log N = N - n^{T-1} + K(n-1)n^{T-1} \log N \, S_2.$$

In the following analysis, we assume the sparsity $K$ to be unity to rule out the effects of the data field on the data aggregation comparison. Figure 2 shows the quantitative comparison of total data transmission measurements with cluster size $n = 4, 16, 64$ for the proposed HDACS method, NCS data aggregation [2], and the HCS method [4] [5] with 1024 sensor nodes. From Figure 2 we find that the bigger the cluster size is, the fewer measurements are needed for data transmission. However, this theoretical analysis does not consider the realistic routing protocol underlying the network architecture in the lower layer. Simply expanding the cluster size, so that all the nodes within a local cluster forward their sensed data directly to the cluster head, would lead to severe data flooding and data loss. Therefore, the cluster size is fixed at 4 and 16 in the following analysis. Figure 3 shows how the total data transmission measurements change with an increasing number of sensor nodes under these two fixed cluster sizes. From the figure, we observe that the NCS method introduces a large amount of data redundancy. The HCS method requires only slightly more measurements than the proposed method, but we need to point out that this comparison is based on the premise that the data is propagated on the multi-resolution data architecture. Since a lot of sensors are leaf nodes and only transmit their raw data to their cluster heads in the first level, both in the proposed method and in the HCS method [4] [5], the two methods lead to very similar results in the theoretical analysis.

Fig. 2. Measurements Comparison for 1024 Sensor Networks

Fig. 3. Measurement Comparison for Different Sizes of Sensor Networks: (a) Cluster Size 4; (b) Cluster Size 16
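The totals compared above can be evaluated numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: it takes $S_1$ and $S_2$ as sums over $i = 1, \ldots, T-1$ of $i/n^i$ and $1/n^i$, uses base-2 logarithms, and checks $S_1$ against the standard closed form of an arithmetico-geometric sum. All function names are hypothetical.

```python
import math


def geometric_sums(n, T):
    """S_1 = sum i/n**i and S_2 = sum 1/n**i for i = 1..T-1 (assumed limits)."""
    s1 = sum(i / n**i for i in range(1, T))
    s2 = sum(1 / n**i for i in range(1, T))
    return s1, s2


def s1_closed_form(n, T):
    """Closed form of sum_{i=1}^{m} i*x**i with x = 1/n and m = T-1."""
    x, m = 1.0 / n, T - 1
    return x * (1 - (m + 1) * x**m + m * x ** (m + 1)) / (1 - x) ** 2


def measurement_totals(N, n, T, K=1):
    """HDACS bounds and the HCS/NCS totals, assuming base-2 logarithms."""
    s1, s2 = geometric_sums(n, T)
    leaves = N - n ** (T - 1)              # raw transmissions in level 1
    weight = K * (n - 1) * n ** (T - 1)
    return {
        "hdacs_lower": leaves + weight * math.log2(n) * s1,
        "hdacs_upper": leaves + weight * math.log2(n) * (s1 + s2),
        "hcs": leaves + weight * math.log2(N) * s2,
        "ncs": N * K * math.log2(N),
    }


if __name__ == "__main__":
    for name, value in measurement_totals(1024, 4, 5).items():
        print(f"{name}: {value:.0f}")
```

Under these assumptions, for 1024 sensors with cluster size 4 the HDACS bounds fall below the HCS total, which in turn is far below the NCS total, in line with the ordering described in the text.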



Fig. 4. Compression ratio change with increase of levels for Different Sizes of Sensor Networks: (a) Cluster Size 4; (b) Cluster Size 16



The data compression ratio $\gamma_i$ is the ratio of the transmitted data size to the received data size at level $i$. For $i \ge 2$, the compression ratio is confined to a bounded range. Figure 4 shows how the compression ratio changes in each level for the NCS method, the HCS method, and the proposed HDACS method with cluster sizes 4 and 16. The figure shows that the NCS method provides no compression at all. The HCS method yields compression in the first level by using the compressive sensing method to compress data when the amount of data reaches the global threshold. The proposed hierarchy-based method achieves a remarkable compression ratio for each level, which is well below 0.5. This is very appealing, as nodes that are spatially close to the central sink and work as intermediate nodes usually consume more energy than other nodes. The energy savings in the application-layer task for those nodes will balance the energy consumed in the routing layer, which therefore prolongs the lifetime of the network.

For the energy analysis, we pay more attention to the data transmission cost, and the cost of receiving data is taken as a constant. The transmission energy cost $E_i^{(l)}$ is usually a function of the transmission distance $d_i^{(l)}$ and the data size $M_i^{(l)}$. Therefore, $E_i^{(l)}$ has been modeled as

$$E_i^{(l)} = c_s + c \, d_i^{(l)} M_i^{(l)},$$

which leads to the total transmission energy cost in each level $E_i = \sum_{l=1}^{|C_i|} E_i^{(l)}$. Here, $c_s$ is a constant startup energy consumption for each data transmission task, and $c$ is a constant transmission cost for unit data size per unit distance.
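The following Python sketch illustrates one plausible reading of this energy model: a startup cost $c_s$ per cluster plus a term proportional to the summed head-to-child distance times the data size. The exact functional form, the function names, and the coordinates are assumptions made for illustration.

```python
import math


def cluster_energy(head, children, measurements, c_s=1.0, c=1.0):
    """Energy of one cluster: c_s + c * d * M (assumed form of the model).

    head: (x, y) of the cluster head; children: iterable of (x, y);
    measurements: data size sent by each child.
    """
    # d is the sum of Euclidean distances from the head to its children.
    distance_sum = sum(math.hypot(x - head[0], y - head[1]) for x, y in children)
    return c_s + c * distance_sum * measurements


def level_energy(clusters, c_s=1.0, c=1.0):
    """Total energy of one level: the sum over its clusters."""
    return sum(cluster_energy(h, ch, m, c_s, c) for h, ch, m in clusters)


if __name__ == "__main__":
    demo = [((0.0, 0.0), [(3.0, 4.0), (0.0, 1.0)], 2)]
    print(level_energy(demo))
```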



Assume $d_i^{(l)} = \sum_{(x_t, y_t)} [(x_t - x_{c_i})^2 + (y_t - y_{c_i})^2]^{1/2}$, where $(x_{c_i}, y_{c_i})$ and $(x_t, y_t)$ are the location coordinates of $c_i^{(l)}$ and its children nodes respectively. In a large dense uniformly and randomly distributed sensor network, if $i > 1$ ...

Let $S_1' = \sum_{i=1}^{T-1} i \, n^{-i/2}$, which also admits a closed form. Therefore, the lower bound of total energy consumption $E$ is:

Q(E) = n T - 1 c s (l + S 2 ) + c^7rs 1/2 [(N-n T - 1 ) + K(n-l)n T - 1 S' 1 log n] 

and upper bound of E is: 

O(E) = n T - 1 c s (l+S 2 )+c^7vs 1/2 [(N-n T - 1 )+K(n-l)n T - 1 (S[+S! 2 )logn] 

Following a similar derivation, we get the transmission energy consumption for the NCS method in [2] with the same data aggregation architecture:

E NCS = n T - 1 c s {l+S 2 )+c-^<Ks 1/2 \ogN[{N-n T - 1 )+K{n-l)n T - 1 S' 2 \ 

and the energy consumption with the HCS method in [4] [5]:

Ehcs = n T - 1 c s {l+S 2 )+c^s 1/2 [{N-n T - 1 )+K{n-l)n T - 1 \ozNS' 2 \ 

To ignore the effects of all the constant parameters, we assume $c_s$, $c$, $K$, and $s$ to be unity. Figure 5 shows the trend of the final transmission energy consumption $E$ for network sizes of 300, 400, 500, 600, 700, and 800 with cluster size 4. The proposed HDACS method achieves the highest efficiency in energy consumption compared with the other methods. In the remainder of this paper, we set up 2D irregular network deployments on the Java-based SIDnet-SWANS simulation platform to demonstrate the feasibility and robustness of our hierarchy model.




B. Implementation procedure 

1) Signal model: In a variety of practical applications in surveillance and habitat monitoring, data fields such as temperature, sound, and pressure measurements are usually smooth. In this paper we ignore the effect of variation of the sparsity $K$ in each level. Therefore, a smooth data field with uniform noise is a practical choice to get a sparse signal representation with identical sparsity $K$. We perform the Discrete Cosine Transform (DCT) for each of the collecting clusters before taking random measurements. The main reasons for choosing the DCT are: a) it yields fast vanishing moments of the signal representation and gives real coefficients, unlike the Discrete Fourier Transform (DFT); b) it does not require the cardinality of measurements to be a power of 2, as the wavelet transform does. We truncate the DCT coefficients by forcing magnitudes below a threshold to zero in order to further sparsify the signals. The threshold is set as a percentage $\alpha$ of the first dominant magnitude. In the actual simulation, $\alpha$ is chosen as 0.01 and 0.005.
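A minimal sketch of the truncation step is given below. It assumes an orthonormal DCT-II and interprets the threshold as $\alpha$ times the largest coefficient magnitude; both choices, and the function names, are assumptions for illustration, not a description of the authors' implementation.

```python
import math


def dct_ii(signal):
    """Orthonormal DCT-II of a real sequence (O(N^2), fine for small clusters)."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        total = sum(x * math.cos(math.pi * (t + 0.5) * k / n)
                    for t, x in enumerate(signal))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


def truncate(coeffs, alpha):
    """Zero every coefficient whose magnitude is below alpha * max magnitude."""
    threshold = alpha * max(abs(c) for c in coeffs)
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]


if __name__ == "__main__":
    # Flat field of 20.0 with a small alternating perturbation as stand-in noise.
    field = [20.0 + 0.01 * (-1) ** t for t in range(16)]
    sparse = truncate(dct_ii(field), alpha=0.01)
    print(sum(1 for c in sparse if c != 0.0), "nonzero coefficient(s)")
```

For a nearly flat field, only the DC coefficient survives the truncation, so the cluster's signal is represented with sparsity close to one.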

2) Routing model and recovery algorithm: The multi-scale routing protocol matches well with the hierarchical data aggregation mechanism. Since our model mainly focuses on dense and large-scale network topologies, the existence of a shortest path between any two nodes is guaranteed.

The CoSaMP algorithm [6] has been adopted as the CS recovery algorithm in our implementation. This algorithm takes $y = \Phi^{*}\Phi x$ as a proxy to represent the signal, inspired by the restricted isometry property of compressive sensing. Compared with other recovery methods, such as various versions of OMP algorithms [7] [8], convex programming methods [9] [10], and combinatorial algorithms [11] [12], the CoSaMP algorithm guarantees computation speed and provides rigorous error bounds.

3) Simulation results: SIDnet-SWANS [14], a sensor network simulation environment that provides a Java-based visual tool for various aspects of applications, has been utilized to study the performance of the proposed algorithm. The JiST system, which stands for Java in Simulation Time, is a Java-based discrete-event simulation engine. The JiST system has been used to obtain the transmission time and energy consumption for each sensor. Figure 6 is a snapshot of the user interface of the newly designed CS data aggregation architecture on SIDnet-SWANS for a 400-sensor network. In this section the performance has been evaluated on the SIDnet-SWANS platform with the JiST system to validate the theoretical analysis.

Fig. 6. Simulation User Interface on SIDnet-SWANS platform

Fig. 7. Recovery performance for 300, 400, 500, 600, 700 sensors

The algorithm was tested on five network sizes: 300, 400, 500, 600, and 700 nodes over a flat data field with uniformly distributed additive white noise. In all these networks, we choose $n = 4$ and $T = 3$. The number of leaf nodes $N_1$ in level one is flexible, which fits the characteristics of two-dimensional random deployment of sensor networks. Therefore, $N = N_1 \, n^{T-1}$. In the recovery procedure, we adopt the idea of the model-based CoSaMP algorithm [13], since the DCT representation makes the support location of the coefficients visible, and design a new CoSaMP algorithm for DCT-based signal ensembles, which accurately recovers the data. We define the signal-to-noise ratio (SNR) as the logarithm of the ratio of the signal power from each sensor to the recovery error at the fusion center. As we see from Figure 7, changing the network size does not affect SNR performance.
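The SNR definition above can be written out as a short Python sketch. The decibel scaling ($10 \log_{10}$) and the function name are assumptions; the text only states that the SNR is the logarithm of signal power over recovery error.

```python
import math


def recovery_snr_db(original, recovered):
    """SNR in dB: 10 * log10(signal power / recovery error power).

    Assumption: decibel scaling; returns infinity for a perfect recovery.
    """
    signal_power = sum(x * x for x in original)
    error_power = sum((x - y) ** 2 for x, y in zip(original, recovered))
    if error_power == 0:
        return math.inf
    return 10.0 * math.log10(signal_power / error_power)


if __name__ == "__main__":
    print(recovery_snr_db([10.0, 10.0], [9.0, 11.0]))
```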

Figure 8 shows the comparison of the transmission energy consumption distribution for a 400-sensor network. Ratio1 is defined as the transmission energy consumption ratio of the proposed HDACS to NCS data aggregation. Ratio2 is defined as the transmission energy consumption ratio of the proposed HDACS to HCS data aggregation. As we see from Figure 8, Ratio1 is less than 0.5, which means more than 50% of the transmission energy is saved compared with NCS data aggregation. Ratio2 is almost always equal to or less than 1, which is owing to the fact that most nodes only transmit data in level one and then finish their job. Both the proposed HDACS and HCS data aggregation adopt the same strategy that only raw data is transmitted by leaf nodes, which explains why most Ratio2 values are equal to one. But for those nodes working as collecting clusters in levels higher than one, Ratio2 values are less than or equal to 0.633, as we expect. The nodes at the highest level save almost 70% power.

Fig. 8. Transmission energy consumption ratio for 400 sensors. Ratio1 is defined as transmission energy consumption ratio of proposed HDACS and NCS data aggregation. Ratio2 is defined as transmission energy consumption ratio of proposed HDACS and HCS data aggregation

Moreover, the results we obtain so
far depend on the frame size per transmission in the MAC layer to some extent. If the data size becomes larger, data will be segmented into more frames for transmission, and this will cost more power. The comparison of the proposed HDACS, NCS, and HCS algorithms always comes down to comparing $\log N_i$ and $\log N$. Suppose one frame size is $m$; then the frame numbers for data sizes $N_i$ and $N$ are $\frac{\log N_i}{m}$ and $\frac{\log N}{m}$ respectively. If $N_i = n^i$ and $N = n^T$, the two frame numbers are $\frac{i \log n}{m}$ and $\frac{T \log n}{m}$. When $i = 2$, $T = 4$, $n = 4$, and $m = 4$, the frame numbers are 1 and 2 respectively (taking logarithms to base 2), which explains how 50% transmission energy is saved by using HDACS data aggregation.
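The frame-count argument can be checked with a few lines of Python. The sketch assumes base-2 logarithms and rounds up to whole frames; the rounding and the function name are assumptions added for illustration.

```python
import math


def frame_count(num_values, frame_size):
    """Frames needed for log2(num_values) measurements, rounded up (assumed)."""
    return max(1, math.ceil(math.log2(num_values) / frame_size))


if __name__ == "__main__":
    n, m = 4, 4
    cluster_frames = frame_count(n ** 2, m)   # cluster of size N_i = n^2
    network_frames = frame_count(n ** 4, m)   # whole network, N = n^4
    print(cluster_frames, network_frames, cluster_frames / network_frames)
```

With these values the cluster-level threshold fits in half as many frames as the global one, matching the 50% figure quoted in the text.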

III. Conclusion and future work 

In this paper, we presented a novel power-efficient hierarchical data aggregation architecture using compressive sensing for large-scale dense sensor networks. It is aimed at reducing the data aggregation complexity and therefore enabling energy saving. The proposed architecture is designed by setting up multiple types of clusters at different levels. The leaf nodes in the lowest level only transmit raw data. The collecting clusters in other levels perform the DCT to get a sparse signal representation of the data from themselves and their children nodes, take random measurements, and then transmit them to their parent cluster heads. When parent collecting clusters receive random measurements, they use the inverse DCT and a DCT model-based CoSaMP algorithm to recover the original data. By repeating these procedures, the cluster heads in the top level collect all the data. We perform a theoretical analysis of the hierarchical data aggregation model with respect to total data transmission number, data compression ratio, and transmission energy consumption. We also implement this model on the SIDnet-SWANS simulation platform and test different sizes of two-dimensional randomly deployed sensor networks. The results validate our model. It guarantees the accuracy of collecting data from all the sensors. The transmission energy is significantly reduced compared with previous work.

In our future work, we will also take into consideration variation in the sparsity $K$. This corresponds to more complex data fields, and an adaptive model will be set up to handle the dynamic nature of data aggregation fields. Besides, other CS recovery algorithms will be investigated to reduce recovery complexity and improve signal recovery accuracy. Distributed compressive sensing [16], which factors in the spatial correlation of data, appears to be a very promising recovery approach. Moreover, other tasks besides data aggregation will also be explored on our proposed hierarchical architecture.